Scalable Preference Learning from Data Streams

Authors

  • Fabon Dzogang
  • Thomas Lansdall-Welfare
  • Saatviga Sudhahar
  • Nello Cristianini
Abstract

We study the task of learning the preferences of online readers of news, based on their past choices. Previous work has shown that it is possible to model this situation as a competition between articles, where the most appealing articles of the day are those selected by the most users. The appeal of an article can be computed from its textual content, and the evaluation function can be learned from training data. In this paper, we show how this task can benefit from an efficient algorithm, based on hashing representations, which enables it to be deployed on high-intensity data streams. We demonstrate the effectiveness of this approach on four real-world news streams, compare it with standard approaches, and describe a new online demonstration based on this technology.
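
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: article text is mapped into a fixed-size vector with the hashing trick, and a linear appeal function is updated online from pairs in which one article was preferred over another. The perceptron-style update, the dimension `DIM`, the learning rate, and all function names are assumptions made for illustration, not the authors' method.

```python
import zlib

DIM = 2 ** 18  # assumed size of the hashed feature space (illustrative only)


def hash_features(text, dim=DIM):
    """Map a bag of words to a sparse vector {index: count} with the hashing trick."""
    vec = {}
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        idx = zlib.crc32(token.encode("utf-8")) % dim
        vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec


def score(weights, vec):
    """Linear appeal score: dot product of sparse weights and sparse features."""
    return sum(weights.get(i, 0.0) * v for i, v in vec.items())


def update(weights, preferred, other, lr=0.1):
    """Perceptron-style pairwise update (an assumed stand-in learner).

    If the preferred article does not already score strictly higher than
    the other one, move the weights toward the preferred article's
    features and away from the other article's features.
    """
    p, o = hash_features(preferred), hash_features(other)
    if score(weights, p) - score(weights, o) <= 0:
        for i, v in p.items():
            weights[i] = weights.get(i, 0.0) + lr * v
        for i, v in o.items():
            weights[i] = weights.get(i, 0.0) - lr * v
    return weights


# Usage: one observed preference pair from a (hypothetical) stream of articles.
weights = {}
update(weights, "royal wedding photos", "quarterly budget report")
```

Because the feature space has a fixed size, memory use does not grow with the vocabulary of the stream, which is the property that makes this kind of representation attractive for high-volume data.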


Similar resources

TECNO-STREAMS: Tracking Evolving Clusters in Noisy Data Streams with a Scalable Immune System Learning Model

Artificial Immune System (AIS) models hold many promises in the field of unsupervised learning. However, existing models are not scalable, which makes them of limited use in data mining. We propose a new AIS based clustering approach (TECNO-STREAMS) that addresses the weaknesses of current AIS models. Compared to existing AIS based techniques, our approach exhibits superior learning abilities, ...

Scalable e-Learning Multimedia Adaptation Architecture

A neglected challenge in existing e-Learning (eL) systems is providing access to multimedia to all users regardless of environmental conditions such as diverse device capabilities, the heterogeneity of the underlying IP network, and user modality preference. This paper proposes a novel two-tier transcoding framework capable of adapting eL multimedia to meet the end-user environmental challenges...

Mining Low Dimensionality Data Streams of Continuous Attributes

This paper presents an incremental and scalable learning algorithm in order to mine numeric, low dimensionality, high-cardinality, time-changing data streams. Within the Supervised Learning field, our approach, named SCALLOP, provides a set of decision rules whose size is very near to the number of concepts to be extracted. Experimental results with synthetic databases of different complexity d...

Collaborative Context-aware Preference Learning

Preference learning methods work by exploiting patterns in the data that relate users to items. Preference data often includes information such as the context of a recommendation (e.g. time/date, location). Leveraging this data (e.g. click logs, purchase/usage data) can significantly improve the relevance and quality of the recommendation. In this work we introduce a novel scalable context-awar...

Jubatus: An Open Source Platform for Distributed Online Machine Learning

Distributed computing is essential for handling very large datasets. Online learning is also promising for learning from rapid data streams. However, it is still an unresolved problem how to combine them for scalable learning and prediction on big data streams. We propose a general computational framework called loose model sharing for online and distributed machine learning. The key is to shar...

Publication date: 2015